Efficient Techniques for Crowdsourced Top-k Lists
Abstract
We focus on the problem of obtaining top-k lists of items from larger itemsets, using human workers to compare items. An example application is short-listing a large set of college applications using advanced students as workers. We describe novel efficient techniques and explore their tolerance to adversarial behavior and the tradeoffs among different measures of performance (latency, expense, and quality of results). We empirically evaluate the proposed techniques against prior art using simulations as well as real crowds on Amazon Mechanical Turk. A randomized variant of the proposed algorithms achieves significant budget savings, especially for very large itemsets and large top-k lists, with negligible risk of lowering the quality of the output.
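To make the setting in the abstract concrete, the following is a minimal baseline sketch, not the techniques proposed in the paper. It assumes a simulated worker that compares two items and errs with a fixed probability, takes a majority vote over several answers per comparison, and selects the top-k by repeated linear scans for the maximum. All names (`make_noisy_worker`, `majority_compare`, `top_k`) and the error model are hypothetical and chosen only for illustration. The returned cost counts individual worker answers, which stands in for the expense measure.

```python
import random
from collections import Counter


def make_noisy_worker(scores, error_rate, rng):
    """Build a hypothetical simulated worker.

    The worker returns the item with the higher hidden score, but with
    probability `error_rate` it returns the other item instead.
    """
    def worker(a, b):
        better, worse = (a, b) if scores[a] >= scores[b] else (b, a)
        return worse if rng.random() < error_rate else better
    return worker


def majority_compare(a, b, worker, votes=3):
    """Ask for `votes` answers (use an odd number) and return the majority winner."""
    tally = Counter(worker(a, b) for _ in range(votes))
    return tally.most_common(1)[0][0]


def top_k(items, k, worker, votes=3):
    """Select k items by repeatedly scanning for the current best item.

    Returns (selected items in order, number of worker answers used).
    """
    remaining = list(items)
    selected = []
    cost = 0
    for _ in range(min(k, len(remaining))):
        champion = remaining[0]
        for challenger in remaining[1:]:
            champion = majority_compare(champion, challenger, worker, votes)
            cost += votes  # each comparison consumes `votes` worker answers
        selected.append(champion)
        remaining.remove(champion)
    return selected, cost


# Example with error-free workers so the outcome is deterministic.
scores = {"a": 0.9, "b": 0.4, "c": 0.7, "d": 0.1}
worker = make_noisy_worker(scores, error_rate=0.0, rng=random.Random(0))
print(top_k(list(scores), 2, worker))  # (['a', 'c'], 15)
```

This baseline spends roughly k times n comparisons, each multiplied by the number of votes. That cost is the kind of expense that more efficient or randomized strategies would aim to reduce. Raising `error_rate` above zero shows how worker mistakes can degrade the quality of the returned list.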
Similar resources
Crowdsourced Top-k Algorithms: An Experimental Evaluation
Crowdsourced top-k computation has attracted significant attention recently, thanks to emerging crowdsourcing platforms, e.g., Amazon Mechanical Turk and CrowdFlower. Crowdsourced top-k algorithms ask the crowd to compare the objects and infer the top-k objects based on the crowdsourced comparison results. The crowd may return incorrect answers, but traditional top-k algorithms cannot tolerate ...
A Confidence-Aware Top-k Query Processing Toolkit on Crowdsourcing
Ranking techniques have been widely used in ubiquitous applications such as recommendation and information retrieval. For ranking items that are hard for machines to compare but easy for humans, crowdsourcing is considered an emerging technique that processes the ranking with human effort. However, there is a lack of an easy-to-use toolkit for answering crowdsourced top-k queries with minimal effort. In this work, we de...
Best position algorithms for efficient top-k query processing
The general problem of answering top-k queries can be modeled using lists of data items sorted by their local scores. The main algorithm proposed so far for answering top-k queries over sorted lists is the Threshold Algorithm (TA). However, TA may still incur a lot of useless accesses to the lists. In this paper, we propose two algorithms that are much more efficient than TA. First, we propose ...
TopCrowd - Efficient Crowd-enabled Top-k Retrieval on Incomplete Data
Building databases and information systems over data extracted from heterogeneous sources like the Web poses a severe challenge: most data is incomplete and thus difficult to process in structured queries. This is especially true for sophisticated query techniques like Top-k querying where rankings are aggregated over several sources. The intelligent combination of efficient data processing alg...
Best Position Algorithms for Top-k Queries
The general problem of answering top-k queries can be modeled using lists of data items sorted by their local scores. The most efficient algorithm proposed so far for answering top-k queries over sorted lists is the Threshold Algorithm (TA). However, TA may still incur a lot of useless accesses to the lists. In this paper, we propose two new algorithms which stop much sooner. First, we propose ...